NVIDIA and OpenAI Unveil Breakthrough AI Performance with GB200 NVL72 and gpt-oss Models
NVIDIA has partnered with OpenAI to push the boundaries of AI capabilities, achieving inference throughput of up to 1.5 million tokens per second (TPS) on the GB200 NVL72 system. The newly released gpt-oss-20b and gpt-oss-120b models use a mixture-of-experts (MoE) architecture with SwiGLU activations, optimized for NVIDIA's Blackwell platform. Both models support a 128k context length and ship in FP4 precision, tailored for data center GPUs.
The collaboration extends to open-source frameworks such as Hugging Face Transformers and NVIDIA TensorRT-LLM, making the models broadly accessible to developers. Training gpt-oss-120b alone consumed more than 2.1 million GPU hours, underscoring the computational intensity behind these advancements.
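As a rough illustration of the Hugging Face Transformers path mentioned above, the sketch below loads a gpt-oss checkpoint through the standard `pipeline` API. The model id `openai/gpt-oss-20b` and the loading options are assumptions based on common Hub conventions; verify the exact id and hardware requirements before use, since running a 20B-parameter MoE model locally requires substantial GPU memory.

```python
def build_generator(model_id: str = "openai/gpt-oss-20b"):
    """Return a text-generation pipeline for a gpt-oss checkpoint.

    Assumption: the weights are published on the Hugging Face Hub under
    this id; swap in the correct id if it differs.
    """
    # Deferred import so the helper can be defined without transformers installed.
    from transformers import pipeline

    # device_map="auto" shards the MoE layers across available GPUs;
    # torch_dtype="auto" keeps the checkpoint's native precision.
    return pipeline(
        "text-generation",
        model=model_id,
        torch_dtype="auto",
        device_map="auto",
    )


if __name__ == "__main__":
    generator = build_generator()
    out = generator(
        [{"role": "user", "content": "Summarize mixture-of-experts in one sentence."}],
        max_new_tokens=128,
    )
    print(out[0]["generated_text"])
```

The chat-style message list reflects the conversational format these instruction-tuned models expect; for production serving, TensorRT-LLM is the throughput-oriented alternative the article highlights.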